Factor analysis with sampling methods for text dependent speaker recognition
Abstract
Factor analysis is a method for embedding high-dimensional data into a lower-dimensional factor space. When data are multimodal, we use mixtures of factor analyzers (MFA), which assume statistically independent samples. In speaker recognition, samples are not independent because they depend on the speaker in the utterance. In joint factor analysis (JFA) and i-vectors, the MFA latent factors are tied at different levels. For example, they can be tied across a segment to extract utterance-level information. Tied MFA approaches usually have the drawback that computing the exact posterior of the hidden variables (component responsibilities and latent factors) is infeasible. For JFA, the preferred approximation computes the responsibilities given a speaker-independent GMM and keeps them fixed during the rest of the process. As a result, the estimated responsibilities for a given sample are independent of the other samples in the utterance and do not take the shared speaker and channel into account. We present a novel approximation that jointly estimates responsibilities and latent factors by sampling the latent factor space. This model differs from previous ones in hidden variable estimation, parameter estimation, and likelihood evaluation. The approach was tested on the RSR2015 database for text-dependent speaker recognition.
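To make the fixed-responsibility approximation described in the abstract concrete, here is a minimal sketch of how responsibilities are computed from a speaker-independent GMM. It is not the authors' implementation: the function name, the diagonal-covariance assumption, and the toy parameters are all illustrative assumptions. It shows that each frame's responsibilities depend only on that frame and the GMM, not on the other frames of the utterance.

```python
import numpy as np

def ubm_responsibilities(frames, weights, means, variances):
    """Posterior probability of each GMM component for each frame.

    frames: (T, D) feature vectors; weights: (K,);
    means, variances: (K, D), assuming diagonal covariances (an assumption).
    """
    diff = frames[:, None, :] - means[None, :, :]               # (T, K, D)
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)                   # Mahalanobis term
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)       # normalizer
    )                                                           # (T, K)
    log_joint = np.log(weights) + log_gauss
    log_joint -= log_joint.max(axis=1, keepdims=True)           # numerical stability
    resp = np.exp(log_joint)
    return resp / resp.sum(axis=1, keepdims=True)

# Toy, hypothetical GMM with K=3 components in D=2 dimensions.
rng = np.random.default_rng(0)
weights = np.array([0.5, 0.3, 0.2])
means = rng.normal(size=(3, 2))
variances = np.ones((3, 2))
frames = rng.normal(size=(5, 2))                                # one 5-frame "utterance"

resp = ubm_responsibilities(frames, weights, means, variances)
```

Because the GMM is speaker independent and the computation is frame by frame, running the function on a single frame gives the same result as running it on the whole utterance. This is the independence that the abstract identifies as the limitation of the fixed-alignment approach.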
Similar resources
Text-Dependent Speaker Verification System in VHF Communication Channel
Text-independent speaker verification can reach high accuracy provided that there is a sufficient amount of training and test speech. Gaussian mixture model universal background model (GMM-UBM), joint factor analysis (JFA) and identity-vector (i-vector) are the dominant techniques in this area because of their superior performance. However, their accuracies drop significant...
ALIZE/spkdet: a state-of-the-art open source software for speaker recognition
This paper presents the ALIZE/SpkDet open source software packages for text-independent speaker recognition. This software is based on the well-known UBM/GMM approach. It also includes the latest speaker recognition developments such as Latent Factor Analysis (LFA) and unsupervised adaptation. Discriminant classifiers such as SVM supervectors are also provided, linked with the Nuisance Attribut...
In-domain versus out-of-domain training for text-dependent JFA
We propose a simple and effective strategy to cope with dataset shifts in text-dependent speaker recognition based on Joint Factor Analysis (JFA). We have previously shown how to compensate for lexical variation in text-dependent JFA by adapting the Universal Background Model (UBM) to individual passphrases. A similar type of adaptation can be used to port a JFA model trained on out-of-domain d...
Speaker Recognition Using Gaussian Mixtures Models
A speech signal contains several levels of information. At the first level, it carries the spoken message. At the second level, it also gives information about the speaker's identity, emotional state and so on. The task of speaker recognition can be divided into two parts: speaker identification and speaker verification. Speaker identification is answering the question which one o...
Features and Techniques for Speaker Recognition
This paper aims to provide a brief overview of the area of speaker recognition. Speaker recognition can be classified into text-dependent and text-independent methods. The speech signal features that are, or have been, used for speaker recognition are presented briefly. The commonly used techniques of pattern matching for the decision making process in speake...